Maximizing Modularity is hard

Abstract. Several algorithms have been proposed to compute partitions of networks into
communities that score high on a graph clustering index called modularity. While
publications on these algorithms typically contain experimental evaluations to emphasize the
plausibility of results, none of these algorithms has been shown to actually compute optimal
partitions. We here settle the unknown complexity status of modularity maximization by
showing that the corresponding decision version is NP-complete in the strong sense. As a
consequence, any efficient, i.e. polynomial-time, algorithm is only heuristic and yields
suboptimal partitions on many instances.

1 Introduction 

Partitioning networks into communities is a fashionable statement of the graph clustering
problem, which has been studied for decades and whose applications abound. 

Recently, a new graph clustering index called modularity has been proposed [10]. It 
immediately prompted a number of follow-up studies concerning different applications 
and possible adjustments of the measure (see, e.g., [3,4,7,13]). Also, a wide range of
algorithmic approaches has been considered, for example based on greedy
agglomeration [1,8], spectral division [9,12], simulated annealing [6,11] and extremal
optimization [2].

None of these algorithms, however, has been shown to produce optimal partitions.
While the complexity status of modularity maximization is open, it has been speculated [9] 
that it might be NP-hard due to similarity with the MAX-CUT problem. 

In this paper, we provide the first complexity-theoretic argument as to why the problem
of maximizing modularity is intractable by proving that it is NP-complete in the
strong sense. This means that there is no correct polynomial-time algorithm to solve this
problem for every instance unless P = NP. Therefore, all of the above algorithms
necessarily deliver suboptimal solutions on some instances, and there is no hope for an
efficient algorithm that computes maximum modularity partitions on all problem instances.
In a sense, our result thus justifies the use of heuristics for modularity optimization.

2 Modularity 

Modularity is a quality index for clusterings defined as follows. We are given a simple
graph $G = (V, E)$, where $V$ is the set of vertices and $E$ the set of (undirected) edges. If not
stated otherwise, $n = |V|$ and $m = |E|$ throughout. The degree $\deg(v)$ of a vertex $v \in V$
is the number of edges incident to $v$. A cluster or community $C \subseteq V$ is a subset of the
vertices. A clustering $\mathcal{C} = \{C_1, \dots, C_k\}$ of $G$ is a partition of $V$ into clusters such that each
vertex appears in exactly one cluster. With a slight disambiguation, the modularity [10]
$Q(\mathcal{C})$ of a clustering $\mathcal{C}$ is defined as

\[
Q(\mathcal{C}) = \sum_{C \in \mathcal{C}} \left[ \frac{|E(C)|}{m} - \left( \frac{|E(C)| + \sum_{C' \in \mathcal{C}} |E(C, C')|}{2m} \right)^2 \right], \tag{1}
\]

where $E(C, C')$ denotes the set of edges between vertices in clusters $C$ and $C'$, and $E(C) =
E(C, C)$. Note that $C'$ ranges over all clusters, so that edges in $E(C)$ are counted twice
in the squared expression. This is to adjust proportions, since edges in $E(C, C')$, $C \neq C'$,
are counted twice as well, once for each order of the arguments. Note that we can rewrite
Eq. (1) into the more convenient form

\[
Q(\mathcal{C}) = \sum_{C \in \mathcal{C}} \left[ \frac{|E(C)|}{m} - \left( \frac{\sum_{v \in C} \deg(v)}{2m} \right)^2 \right]. \tag{2}
\]

It reveals an inherent trade-off: to maximize the first term, many edges should be contained 
in clusters, whereas minimization of the second term is achieved by splitting the graph 
into many clusters of small total degrees. In the remainder of this paper, we will make use 
of this formulation. 
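
As an illustration of the degree-sum formulation, the following minimal Python sketch evaluates the modularity of a given clustering. It is not part of the original paper: the function name `modularity`, the edge-list input format, and the small example graph are assumptions made for this sketch. Exact fractions are used so that values can be compared without rounding.

```python
from collections import defaultdict
from fractions import Fraction


def modularity(edges, clustering):
    """Modularity of a clustering of a simple undirected graph (degree-sum form).

    edges      -- list of (u, v) pairs, each undirected edge listed once
    clustering -- list of disjoint vertex sets covering all edge endpoints
    """
    m = len(edges)
    cluster_of = {v: idx for idx, cluster in enumerate(clustering) for v in cluster}
    covered = defaultdict(int)     # |E(C)|: edges with both endpoints in C
    degree_sum = defaultdict(int)  # sum of deg(v) over all v in C
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            covered[cu] += 1
    return sum(
        Fraction(covered[c], m) - Fraction(degree_sum[c], 2 * m) ** 2
        for c in range(len(clustering))
    )


# Hypothetical example graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(modularity(edges, [{0, 1, 2}, {3, 4, 5}]))  # 5/14
print(modularity(edges, [{0, 1, 2, 3, 4, 5}]))    # 0
```

The two outputs show the trade-off on a small scale: putting everything into one cluster covers all edges but pays the full squared degree sum, whereas the two-triangle clustering gives up one edge and scores higher.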

3 NP-Completeness 

To formulate our complexity-theoretic result, we need to consider the following decision 
problem underlying modularity maximization. 

Problem 1 (Modularity) Given a graph $G$ and a number $K$, is there a clustering $\mathcal{C}$
of $G$, for which $Q(\mathcal{C}) \geq K$?

Note that we may ignore the fact that, in principle, $K$ could be a real number in the range
$[0, 1]$, because $4m^2 \cdot Q(\mathcal{C})$ is integer for every partition $\mathcal{C}$ of $G$ and polynomially bounded
in the size of $G$.
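
The integrality claim can be checked in one line. The following worked equation is a sketch added for illustration; it multiplies the degree-sum form of modularity, $Q(\mathcal{C}) = \sum_{C} \big[ |E(C)|/m - (\sum_{v \in C} \deg(v) / 2m)^2 \big]$, by $4m^2$:

```latex
\[
4m^2 \cdot Q(\mathcal{C})
  = 4m \sum_{C \in \mathcal{C}} |E(C)|
    - \sum_{C \in \mathcal{C}} \Big( \sum_{v \in C} \deg(v) \Big)^2 .
\]
```

Both sums on the right consist of integers, and each is at most $4m^2$ (the first because at most $m$ edges are covered, the second because all degrees together sum to $2m$). The left-hand side is therefore an integer of absolute value at most $4m^2$.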

Note also that modularity maximization cannot be easier than the decision problem, 
because determining the maximum possible modularity index of a graph immediately 
yields an answer to the decision question. 

Our hardness result for Modularity is based on a transformation from the following 
decision problem. 

Problem 2 (3-Partition) Given $3k$ positive integer numbers $a_1, \dots, a_{3k}$ such that the
sum $\sum_{i=1}^{3k} a_i = kb$ and $b/4 < a_i < b/2$ for an integer $b$ and for all $i = 1, \dots, 3k$, is there a
partition of these numbers into $k$ sets, such that the numbers in each set sum up to $b$?
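
To make the problem statement concrete, here is a small exhaustive solver, given as a sketch only. The function name `three_partition` and the two test instances are assumptions introduced for illustration. Because every number lies strictly between $b/4$ and $b/2$, each set of a valid partition contains exactly three numbers, which the search exploits. Its running time is exponential, in line with the hardness of the problem.

```python
def three_partition(numbers):
    """Return k triples that each sum to b, or None if no such partition exists."""
    k, rest = divmod(len(numbers), 3)
    if k == 0 or rest != 0 or sum(numbers) % k != 0:
        raise ValueError("need 3k positive integers whose sum is a multiple of k")
    b = sum(numbers) // k
    if any(not (b < 4 * x and 2 * x < b) for x in numbers):
        raise ValueError("every number must satisfy b/4 < a_i < b/2")

    def solve(remaining):
        # The first remaining number must go into some triple: try all partners.
        if not remaining:
            return []
        first, others = remaining[0], remaining[1:]
        for i in range(len(others)):
            for j in range(i + 1, len(others)):
                if first + others[i] + others[j] == b:
                    left = others[:i] + others[i + 1:j] + others[j + 1:]
                    tail = solve(left)
                    if tail is not None:
                        return [(first, others[i], others[j])] + tail
        return None

    return solve(sorted(numbers))


print(three_partition([2, 2, 2, 2, 3, 3]))  # [(2, 2, 3), (2, 2, 3)]
print(three_partition([4, 4, 4, 4, 4, 6]))  # None
```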

We will show that an instance $A = \{a_1, \dots, a_{3k}\}$ of 3-Partition can be transformed
into an instance $(G(A), K(A))$ of Modularity, such that $G(A)$ has a clustering with
modularity at least $K(A)$, if and only if $a_1, \dots, a_{3k}$ can be partitioned into $k$ sets of sum
$b = \frac{1}{k} \sum_{i=1}^{3k} a_i$ each.

It is crucial that 3-Partition is strongly NP-complete [5], i.e. the problem remains
NP-complete even if the input is represented in unary coding. This implies that no algorithm
can decide the problem in time polynomial even in the sum of the input values, unless
P = NP. More importantly, it implies that our transformation need only be
pseudo-polynomial.

The reduction is defined as follows. From an instance $A$ of 3-Partition, construct a
graph $G(A)$ with $k$ cliques (completely connected subgraphs) $H_1, \dots, H_k$ of size $a = \sum_{i=1}^{3k} a_i$
each. For each element $a_i \in A$ we introduce a single element vertex, and connect it to $a_i$
vertices in each of the $k$ cliques in such a way that each clique member is connected to
exactly one element vertex. It is easy to see that each clique vertex then has degree $a$ and
the element vertex corresponding to element $a_i \in A$ has degree $k a_i$. The number of edges in
$G(A)$ is $m = \frac{k}{2} a(a+1)$. See Fig. 1 for an example. Note that the size of $G(A)$ is polynomial
in the unary coding size of $A$, so that our transformation is indeed pseudo-polynomial.

Fig. 1. An example graph $G(A)$ for the instance $A = \{2, 2, 2, 2, 3, 3\}$ of 3-Partition. Edge colors indicate
edges to and within the $k = 2$ cliques $H_1$ (red) and $H_2$ (blue). Vertex labels indicate the corresponding
numbers $a_i \in A$.
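
The construction can be sketched in a few lines of Python. The function name `build_reduction_graph` and the tuple labels for vertices are assumptions made for this illustration, and the particular assignment of clique members to element vertices is just one of several valid choices. The checks at the end confirm the stated degrees and edge count for the instance of Fig. 1.

```python
from collections import Counter
from itertools import combinations


def build_reduction_graph(a_values):
    """Edge list of G(A): k cliques of size a = sum(A), one element vertex per a_i."""
    k = len(a_values) // 3
    a = sum(a_values)
    edges = []
    for h in range(k):
        clique = [("clique", h, p) for p in range(a)]
        edges.extend(combinations(clique, 2))  # all edges inside clique H_h
        position = 0
        for i, a_i in enumerate(a_values):
            for _ in range(a_i):               # a_i edges from element i into H_h
                edges.append((("element", i), clique[position]))
                position += 1                  # each clique member is used exactly once
    return edges


A = [2, 2, 2, 2, 3, 3]
k, a = len(A) // 3, sum(A)
edges = build_reduction_graph(A)
degree = Counter(v for edge in edges for v in edge)

print(len(edges) == k * a * (a + 1) // 2)                                  # True
print(all(degree[("clique", h, p)] == a for h in range(k) for p in range(a)))  # True
print(all(degree[("element", i)] == k * a_i for i, a_i in enumerate(A)))       # True
```

The graph has $ka + 3k$ vertices, which is polynomial in the sum of the input numbers but not in their binary encoding length. This is why the strong NP-completeness of 3-Partition is needed.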

Before specifying bound K(A) for the instance of Modularity, we will show three 
properties of maximum modularity clusterings of G(A). Together these properties establish 
the desired characterization of solutions for 3-Partition by solutions for Modularity. 

Lemma 1. In a maximum modularity clustering of $G(A)$, none of the cliques $H_1, \dots, H_k$
is split.

Proof. We consider a clustering $\mathcal{C}$ that splits a clique $H \in \{H_1, \dots, H_k\}$ into different
clusters and then show how to obtain a clustering with strictly higher modularity. Suppose
that $C_1, \dots, C_r \in \mathcal{C}$, $r > 1$, are the clusters that contain vertices of $H$. For $i = 1, \dots, r$ we
denote by

- $n_i$ the number of vertices of $H$ contained in cluster $C_i$,
- $m_i = |E(C_i)|$ the number of edges between vertices in $C_i$,
- $f_i$ the number of edges between vertices of $H$ in $C_i$ and element vertices in $C_i$,
- $d_i$ the sum of degrees of all vertices in $C_i$.

The contribution of $C_1, \dots, C_r$ to $Q(\mathcal{C})$ is

\[
\frac{1}{m} \sum_{i=1}^{r} m_i - \frac{1}{4m^2} \sum_{i=1}^{r} d_i^2 .
\]

Now suppose we create a clustering $\mathcal{C}'$ by rearranging the vertices in $C_1, \dots, C_r$ into clusters
$C', C'_1, \dots, C'_r$, such that $C'$ contains exactly the vertices of clique $H$, and each $C'_i$, $1 \leq i \leq r$,
the remaining elements of $C_i$ (if any). In this new clustering the number of covered edges
reduces by $\sum_{i=1}^{r} f_i$, because all vertices from $H$ are removed from the clusters $C'_i$. This
labels the edges connecting the clique vertices to other non-clique vertices of $C_i$ as
inter-cluster edges. For $H$ itself there are $\sum_{i=1}^{r} \sum_{j=i+1}^{r} n_i n_j$ edges that are now additionally
covered due to the creation of cluster $C'$. In terms of degrees the new cluster $C'$ contains
$a$ vertices of degree $a$. The sums for the remaining clusters $C'_i$ are reduced by the degrees
of the clique vertices, as these vertices are now in $C'$. So the contribution of these clusters
to $Q(\mathcal{C}')$ is given by

\[
\frac{1}{m} \left( \sum_{i=1}^{r} m_i + \sum_{i=1}^{r} \sum_{j=i+1}^{r} n_i n_j - \sum_{i=1}^{r} f_i \right)
- \frac{1}{4m^2} \left( a^4 + \sum_{i=1}^{r} (d_i - n_i a)^2 \right),
\]

so that

\[
Q(\mathcal{C}') - Q(\mathcal{C}) = \frac{1}{m} \left( \sum_{i=1}^{r} \sum_{j=i+1}^{r} n_i n_j - \sum_{i=1}^{r} f_i \right)
+ \frac{1}{4m^2} \left( \sum_{i=1}^{r} \left( 2 d_i n_i a - n_i^2 a^2 \right) - a^4 \right).
\]

Using the fact that $2 \sum_{i=1}^{r} \sum_{j=i+1}^{r} n_i n_j = \sum_{i=1}^{r} \sum_{j \neq i} n_i n_j$, substituting $m = \frac{k}{2} a(a+1)$
and rearranging terms we get

\[
\begin{aligned}
Q(\mathcal{C}') - Q(\mathcal{C})
&= \frac{a}{4m^2} \left( -a^3 - 2k(a+1) \sum_{i=1}^{r} f_i + \sum_{i=1}^{r} n_i \Big( 2 d_i - n_i a + k(a+1) \sum_{j \neq i} n_j \Big) \right) \\
&\geq \frac{a}{4m^2} \left( -a^3 - 2k(a+1) \sum_{i=1}^{r} f_i + \sum_{i=1}^{r} n_i \Big( n_i a + 2k f_i + k(a+1) \sum_{j \neq i} n_j \Big) \right).
\end{aligned}
\]


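For the last inequality we use the fact that $d_i \geq n_i a + k f_i$. This inequality holds because
$C_i$ contains at least the $n_i$ vertices of degree $a$ from the clique $H$. In addition it contains
both the clique and element vertices for each edge counted in $f_i$. For each such edge there
are $k - 1$ other edges connecting the element vertex to the $k - 1$ other cliques. Hence, we
get a contribution of $k f_i$ in the degrees of the element vertices. Combining the terms $n_i$
and one of the terms $\sum_{j \neq i} n_j$, and using $\sum_{j=1}^{r} n_j = a$, we get

\[
\begin{aligned}
Q(\mathcal{C}') - Q(\mathcal{C})
&\geq \frac{a}{4m^2} \left( -a^3 - 2k(a+1) \sum_{i=1}^{r} f_i + \sum_{i=1}^{r} n_i \Big( a \sum_{j=1}^{r} n_j + 2k f_i + \big( (k-1)a + k \big) \sum_{j \neq i} n_j \Big) \right) \\
&= \frac{a}{4m^2} \sum_{i=1}^{r} \left( 2k f_i (n_i - a - 1) + \big( (k-1)a + k \big) \sum_{j \neq i} n_i n_j \right) \\
&\geq \frac{a}{4m^2} \sum_{i=1}^{r} n_i \left( 2k (n_i - a - 1) + \big( (k-1)a + k \big) \sum_{j \neq i} n_j \right).
\end{aligned}
\]

For the last step we note that $n_i \leq a - 1$ and $n_i - a - 1 < 0$ for all $i = 1, \dots, r$. So
increasing $f_i$ decreases the modularity difference. For each vertex of $H$ there is at most
one edge to a vertex not in $H$, and thus $f_i \leq n_i$.
By rearranging, with $n_i - a = -\sum_{j \neq i} n_j$ and $\sum_{j \neq i} n_j \geq 1$, and using the fact that $a \geq 3k$ we get

\[
\begin{aligned}
Q(\mathcal{C}') - Q(\mathcal{C})
&\geq \frac{a}{4m^2} \sum_{i=1}^{r} n_i \left( -2k + \big( (k-1)a - k \big) \sum_{j \neq i} n_j \right) \\
&\geq \frac{a}{4m^2} \big( (k-1)a - 3k \big) \sum_{i=1}^{r} \sum_{j \neq i} n_i n_j \\
&\geq \frac{3k^2}{4m^2} (3k - 6) \sum_{i=1}^{r} \sum_{j \neq i} n_i n_j > 0,
\end{aligned}
\]

as we can assume $k > 2$ for all relevant instances of 3-Partition. This shows that any
clustering can be improved by merging each clique completely into a cluster. This proves
the lemma. □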

Next, we observe that the optimum clustering places at most one clique completely 
into a single cluster. 



Lemma 2. In a maximum modularity clustering of $G(A)$, every cluster contains at most
one of the cliques $H_1, \dots, H_k$.



Proof. Consider a maximum modularity clustering. The previous lemma shows that each
of the $k$ cliques $H_1, \dots, H_k$ is entirely contained in one cluster. Assume that there is a
cluster $C$ which contains at least two of the cliques. If $C$ does not contain any element
vertices, then the cliques form disconnected components in the cluster. In this case it is
easy to see that the clustering can be improved by splitting $C$ into distinct clusters, one for
each clique. In this way we keep the number of edges within clusters the same, but
we reduce the squared degree sums of clusters.

Otherwise, we assume $C$ contains $l > 1$ cliques completely and in addition some element
vertices of elements $a_j$ with $j \in J \subseteq \{1, \dots, 3k\}$. Note that inside the $l$ cliques $\frac{l}{2} a(a-1)$
edges are covered. In addition, for every element vertex corresponding to an element $a_j$
there are $l a_j$ edges included. The degree sum of the cluster is given by the $la$ clique vertices
of degree $a$ and some number of element vertices of degree $k a_j$. The contribution of $C$ to
$Q(\mathcal{C})$ is thus given by

\[
\frac{1}{m} \left( \frac{l}{2} a(a-1) + l \sum_{j \in J} a_j \right) - \frac{1}{4m^2} \left( l a^2 + k \sum_{j \in J} a_j \right)^2 .
\]

Now suppose we create $\mathcal{C}'$ by splitting $C$ into $C'_1$ and $C'_2$ such that $C'_1$ completely contains
a single clique $H$. This leaves the number of edges covered within the cliques the same,
however, all edges from $H$ to the included element vertices are no longer covered. The degree
sum of $C'_1$ is exactly $a^2$, and so the contribution of $C'_1$ and $C'_2$ to $Q(\mathcal{C}')$ is given by

\[
\frac{1}{m} \left( \frac{l}{2} a(a-1) + (l-1) \sum_{j \in J} a_j \right) - \frac{1}{4m^2} \left( \Big( (l-1) a^2 + k \sum_{j \in J} a_j \Big)^2 + a^4 \right).
\]

Considering the difference we note that

\[
\begin{aligned}
Q(\mathcal{C}') - Q(\mathcal{C})
&= -\frac{1}{m} \sum_{j \in J} a_j + \frac{1}{4m^2} \left( (2l-1) a^4 + 2k a^2 \sum_{j \in J} a_j - a^4 \right) \\
&= \frac{2(l-1) a^4 - 2ka \sum_{j \in J} a_j}{4m^2} \\
&\geq \frac{9k^3}{2m^2} (9k - 1) > 0,
\end{aligned}
\]

where the last inequality uses $l \geq 2$, $\sum_{j \in J} a_j \leq a$ and $a \geq 3k$, and the bound is positive
as $k > 0$ for all instances of 3-Partition.

Since the clustering is improved in each case, it is not optimal. This is a contradiction.

□



The previous two lemmas show that any clustering can be strictly improved to a
clustering that contains $k$ clique clusters, such that each one completely contains one of
the cliques $H_1, \dots, H_k$ (possibly plus some additional element vertices). In particular, this
must hold for the optimum clustering as well. Now that we know how the cliques are
clustered we turn to the element vertices.

As they are not directly connected, it is never optimal to create a cluster consisting only 
of element vertices. Splitting such a cluster into singleton clusters, one for each element 
vertex, reduces the squared degree sums but keeps the edge coverage at the same value. 
Hence, such a split yields a clustering with strictly higher modularity. The next lemma 
shows that we can further strictly improve the modularity of a clustering with a singleton 
cluster of an element vertex by joining it with one of the clique clusters. 

Lemma 3. In a maximum modularity clustering of G(A), there is no cluster composed of 
element vertices only. 

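Proof. Consider a clustering $\mathcal{C}$ of maximum modularity and suppose that there is an
element vertex $v_i$ corresponding to the element $a_i$, which is not part of any clique cluster.
As argued above we can improve such a clustering by creating a singleton cluster $C = \{v_i\}$.
Suppose $C_{\min}$ is the clique cluster for which the sum of degrees is minimal. We know that
$C_{\min}$ contains all vertices from a clique $H$ and possibly some other element vertices for
elements $a_j$ with $j \in J$ for some index set $J$. The cluster $C_{\min}$ covers all $\frac{a(a-1)}{2}$ edges
within $H$ and $\sum_{j \in J} a_j$ edges to element vertices. The degree sum is $a^2$ for clique vertices
and $k \sum_{j \in J} a_j$ for element vertices. As $C$ is a singleton cluster, it covers no edges and the
degree sum is $k a_i$. This yields a contribution of $C$ and $C_{\min}$ to $Q(\mathcal{C})$ of

\[
\frac{1}{m} \left( \frac{a(a-1)}{2} + \sum_{j \in J} a_j \right) - \frac{1}{4m^2} \left( \Big( a^2 + k \sum_{j \in J} a_j \Big)^2 + k^2 a_i^2 \right).
\]

Again, we create a different clustering $\mathcal{C}'$ by joining $C$ and $C_{\min}$ to a new cluster $C'$. This
increases the edge coverage by $a_i$. The new cluster $C'$ has the sum of degrees of both
previous clusters. The contribution of $C'$ to $Q(\mathcal{C}')$ is given by

\[
\frac{1}{m} \left( \frac{a(a-1)}{2} + a_i + \sum_{j \in J} a_j \right) - \frac{1}{4m^2} \left( a^2 + k a_i + k \sum_{j \in J} a_j \right)^2 ,
\]

so that

\[
\begin{aligned}
Q(\mathcal{C}') - Q(\mathcal{C})
&= \frac{a_i}{m} - \frac{1}{4m^2} \left( 2k a^2 a_i + 2k^2 a_i \sum_{j \in J} a_j \right) \\
&= \frac{a_i}{4m^2} \left( 2ka - 2k^2 \sum_{j \in J} a_j \right).
\end{aligned}
\]

At this point recall that $C_{\min}$ is the clique cluster with the minimum degree sum. For this
cluster the elements corresponding to included element vertices can never sum to more
than $\frac{1}{k} a$. In particular, as $v_i$ is not part of any clique cluster, the elements of vertices in
$C_{\min}$ can never sum to more than $\frac{1}{k}(a - a_i)$. Thus,

\[
\sum_{j \in J} a_j \leq \frac{1}{k} (a - a_i) < \frac{1}{k} a ,
\]

and so $Q(\mathcal{C}') - Q(\mathcal{C}) > 0$. This contradicts the assumption that $\mathcal{C}$ is optimal. □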
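
The algebra in this proof can be double-checked numerically. The sketch below is an illustration only: the function names `merge_gain` and `closed_form`, and the parameter name `sum_J` for the sum of the elements already assigned to the minimum clique cluster, are assumptions. It recomputes the contributions before and after merging the singleton into that cluster and compares the difference with the closed form $\frac{a_i}{4m^2}\big(2ka - 2k^2 \sum_{j \in J} a_j\big)$.

```python
from fractions import Fraction


def merge_gain(k, a, a_i, sum_J):
    """Change in modularity when a singleton element vertex joins the clique cluster."""
    m = Fraction(k * a * (a + 1), 2)
    clique_edges = Fraction(a * (a - 1), 2)
    # Before: clique cluster (degree sum a^2 + k*sum_J) plus singleton (degree k*a_i).
    before = (clique_edges + sum_J) / m \
        - Fraction((a * a + k * sum_J) ** 2 + (k * a_i) ** 2) / (4 * m * m)
    # After: one merged cluster that additionally covers a_i edges.
    after = (clique_edges + a_i + sum_J) / m \
        - Fraction((a * a + k * a_i + k * sum_J) ** 2) / (4 * m * m)
    return after - before


def closed_form(k, a, a_i, sum_J):
    """The simplified difference derived in the proof."""
    m = Fraction(k * a * (a + 1), 2)
    return a_i * (2 * k * a - 2 * k * k * sum_J) / (4 * m * m)


# Values taken from the instance A = {2, 2, 2, 2, 3, 3}: k = 2, a = 14.
print(merge_gain(2, 14, 3, 4))                             # 1/2450
print(merge_gain(2, 14, 3, 4) == closed_form(2, 14, 3, 4)) # True
```

The gain stays positive as long as the elements already in the cluster sum to less than $a/k$, which is exactly the condition established for the minimum clique cluster.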

We have shown that for the graphs G(A) the clustering of maximum modularity
consists of exactly k clique clusters, and each element vertex belongs to exactly one of the
clique clusters. We are now ready to state our main result.

Theorem 3. Modularity is strongly NP-complete. 

Proof. For a given clustering $\mathcal{C}$ of $G(A)$ we can check in polynomial time whether $Q(\mathcal{C}) \geq
K(A)$, so clearly Modularity $\in$ NP.

For NP-completeness we transform an instance $A = \{a_1, \dots, a_{3k}\}$ of 3-Partition into
an instance $(G(A), K(A))$ of Modularity. We have already outlined the construction
of the graph $G(A)$ above. For the correct parameter $K(A)$ we consider a clustering in
$G(A)$ with the properties derived in the previous lemmas, i.e. a clustering with exactly $k$
clique clusters. Any such clustering yields exactly $(k-1)a$ inter-cluster edges, so the edge
coverage is given by

\[
\sum_{C \in \mathcal{C}^*} \frac{|E(C)|}{m} = \frac{m - (k-1)a}{m} = 1 - \frac{2(k-1)a}{ka(a+1)} = 1 - \frac{2k-2}{k(a+1)} .
\]

Hence, the clustering $\mathcal{C}^* = (C_1, \dots, C_k)$ with maximum modularity must minimize

\[
\deg(C_1)^2 + \deg(C_2)^2 + \dots + \deg(C_k)^2 ,
\]

where $\deg(C_i)$ denotes the sum of the degrees of all vertices in $C_i$.
This requires distributing the element vertices, according to their degree, as evenly as
possible among the clusters. In the optimum case we can assign to each cluster element
vertices corresponding to elements that sum to $b = \frac{1}{k} a$. In this case the sum of degrees of
element vertices in each clique cluster is equal to $k \cdot \frac{1}{k} a = a$. This yields $\deg(C_i) = a^2 + a$
for each clique cluster $C_i$, $i = 1, \dots, k$, and gives

\[
\deg(C_1)^2 + \dots + \deg(C_k)^2 \geq k (a^2 + a)^2 = k a^2 (a+1)^2 .
\]

Equality holds only in the case in which an assignment of $b$ to each cluster is possible.
Hence, if there is a clustering $\mathcal{C}$ with $Q(\mathcal{C})$ of at least

\[
K(A) = 1 - \frac{2k-2}{k(a+1)} - \frac{k a^2 (a+1)^2}{k^2 a^2 (a+1)^2} = \frac{(k-1)(a-1)}{k(a+1)} ,
\]

then we know that this clustering must split the element vertices perfectly to the $k$ clique
clusters. As each element vertex is contained in exactly one cluster, this yields a solution
for the instance of 3-Partition. With this choice of $K(A)$ the instance $(G(A), K(A))$ of
Modularity is satisfiable only if the instance $A$ of 3-Partition is satisfiable.



Conversely, suppose the instance for 3-Partition is satisfiable. Then there is a
partition into $k$ sets such that the sum over each set is $\frac{1}{k} a$. If we cluster the corresponding
graph by joining the element vertices of each set with a different clique, we get a
clustering of modularity $K(A)$. This shows that the instance $(G(A), K(A))$ of Modularity is
satisfiable if the instance $A$ of 3-Partition is satisfiable. This completes the reduction
and proves the theorem. □
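
The threshold can be verified on the small instance $A = \{2, 2, 2, 2, 3, 3\}$. The following Python sketch is an illustration under stated assumptions: all function names (`build_reduction_graph`, `modularity`, `clustering_from_partition`, `threshold`) and the vertex labels are hypothetical helpers, not notation from the paper. It checks that a clustering derived from a valid 3-partition attains exactly $K(A) = \frac{(k-1)(a-1)}{k(a+1)}$, while an unbalanced assignment of element vertices falls short.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def build_reduction_graph(a_values):
    """Edge list of G(A): k cliques of size a, each element joined to a_i members per clique."""
    k, a = len(a_values) // 3, sum(a_values)
    edges = []
    for h in range(k):
        clique = [("clique", h, p) for p in range(a)]
        edges.extend(combinations(clique, 2))
        position = 0
        for i, a_i in enumerate(a_values):
            for _ in range(a_i):
                edges.append((("element", i), clique[position]))
                position += 1
    return edges


def modularity(edges, clustering):
    """Sum over clusters of |E(C)|/m - (degree sum of C / 2m)^2, as an exact fraction."""
    m = len(edges)
    cluster_of = {v: idx for idx, cluster in enumerate(clustering) for v in cluster}
    covered, degree_sum = defaultdict(int), defaultdict(int)
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            covered[cu] += 1
    return sum(Fraction(covered[c], m) - Fraction(degree_sum[c], 2 * m) ** 2
               for c in range(len(clustering)))


def clustering_from_partition(a_values, index_sets):
    """Clique cluster h = clique H_h plus the element vertices listed in index_sets[h]."""
    a = sum(a_values)
    return [{("clique", h, p) for p in range(a)} | {("element", i) for i in indices}
            for h, indices in enumerate(index_sets)]


def threshold(a_values):
    """K(A) = (k - 1)(a - 1) / (k (a + 1))."""
    k, a = len(a_values) // 3, sum(a_values)
    return Fraction((k - 1) * (a - 1), k * (a + 1))


A = [2, 2, 2, 2, 3, 3]
edges = build_reduction_graph(A)
balanced = clustering_from_partition(A, [{0, 1, 4}, {2, 3, 5}])    # element sums 7 and 7
unbalanced = clustering_from_partition(A, [{0, 1, 2}, {3, 4, 5}])  # element sums 6 and 8

print(threshold(A))                                  # 13/30
print(modularity(edges, balanced) == threshold(A))   # True
print(modularity(edges, unbalanced) < threshold(A))  # True
```

Both clusterings cover the same number of edges; only the squared degree sums differ, which is exactly the quantity the proof argues must be minimized.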

4 Conclusion 

We have shown that maximizing the popular modularity clustering index is strongly
NP-complete. These results can be generalized to modularity in weighted graphs. We can
consider the graph G to be completely connected and use weights of 0 and 1 on each
edge to indicate its presence. Instead of the numbers of edges the definition of modularity
then employs the sum of edge weights for edges within clusters, between clusters and in
the total graph. This yields an equivalent definition of modularity for graphs, in which
the existence of an edge is modeled with binary weights. An extension of modularity to
arbitrarily weighted graphs is then straightforward. Our hardness result holds also for the
problem of maximizing modularity in weighted graphs, as this more general problem class
includes the problem considered in this paper as a special case.
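
One way to read the weighted generalization is sketched below. This is an assumption-laden illustration rather than a definition taken from the paper: the function name `weighted_modularity` and the input format are hypothetical, and the formula simply replaces edge counts by sums of edge weights in the degree-sum form of modularity. With weights restricted to 0 and 1 it reproduces the unweighted value.

```python
from collections import defaultdict
from fractions import Fraction


def weighted_modularity(weighted_edges, clustering):
    """Modularity with edge counts replaced by sums of edge weights.

    weighted_edges -- list of (u, v, weight) triples, each undirected edge listed once
    clustering     -- list of disjoint vertex sets covering all edge endpoints
    """
    total = sum(Fraction(w) for _, _, w in weighted_edges)
    cluster_of = {v: idx for idx, cluster in enumerate(clustering) for v in cluster}
    covered = defaultdict(Fraction)   # weight of edges inside each cluster
    strength = defaultdict(Fraction)  # sum of weighted degrees in each cluster
    for u, v, w in weighted_edges:
        w = Fraction(w)
        cu, cv = cluster_of[u], cluster_of[v]
        strength[cu] += w
        strength[cv] += w
        if cu == cv:
            covered[cu] += w
    return sum(covered[c] / total - (strength[c] / (2 * total)) ** 2
               for c in range(len(clustering)))


# Two triangles joined by one edge, with binary weights; the pair (0, 5) is a
# non-edge of the original graph and therefore gets weight 0.
triangles = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 3, 1)]
with_non_edge = triangles + [(0, 5, 0)]
clusters = [{0, 1, 2}, {3, 4, 5}]

print(weighted_modularity(triangles, clusters))      # 5/14
print(weighted_modularity(with_non_edge, clusters))  # 5/14
```

Adding weight-0 pairs leaves the value unchanged, which matches the view of an unweighted graph as a complete graph with binary weights.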

Our hardness result shows that there is no polynomial-time algorithm optimizing
modularity unless P = NP. Recently proposed algorithms [1,2,6,8,9,11,12] are therefore incorrect
in the sense that they yield suboptimal solutions on many instances. Furthermore, the result
justifies the use of approximation algorithms and heuristics to cope with the problem.
Future work includes a deeper formal analysis of the properties of modularity and the
development of algorithms with performance guarantees.